2025-04-16 11:24:23.AIbase

OpenAI Acquires Context.ai Team to Enhance AI Model Evaluation

2025-04-10 09:47:04.AIbase

OpenAI Launches Pioneers Program to Redefine AI Model Evaluation

2025-04-09 10:29:31.AIbase

OpenAI Launches Evals API: Ushering in a New Era of Programmatic AI Model Testing

2025-03-21 11:48:03.AIbase

High School Student Creates AI Model Evaluation Website Using Minecraft

2025-03-21 09:45:00.AIbase

Minecraft Transformed into an AI Arena: High School Student Builds Innovative Model Evaluation Platform

2025-01-10 15:49:29.AIbase

Zhipu AI's GLM-4-9B Achieves a Hallucination Rate of Only 1.3%, Taking First Place in a Global Large Model Evaluation

2024-12-05 14:45:53.AIbase

ByteDance Releases New Code Model Evaluation Benchmark 'FullStack Bench'

2024-10-09 15:51:44.AIbase

AI Video Generation Model Evaluation Report: MiniMax Has the Strongest Text Control, Kling 1.5 Masters “Water Pouring”

2024-09-29 15:33:05.AIbase

Salesforce AI Launches SFR-Judge, a New Family of Large Language Model Evaluators Built on Llama 3

2024-08-13 08:11:01.AIbase

The Compass Arena, a Large Model Evaluation Platform, Adds a Multi-Modal Large Model Competition Section

2024-08-07 14:14:43.AIbase

Meta Launches 'Self-Taught Evaluator': NLP Model Evaluation Without Human Annotation, Outperforming Common LLMs Like GPT-4

2024-07-02 10:38:02.AIbase

Anthropic Launches Initiative to Fund Development of New AI Benchmarking Tools

2024-06-20 11:20:15.AIbase

Alibaba Qwen2-72B Tops HELM Ranking: Performance Surpasses Llama3-70B

2024-03-07 03:52:56.AIbase

AI Model Evaluation Company Flags Serious Infringement Issues in GPT-4; Microsoft Engineers Voice Concerns Over Image Generation Features

2023-11-30 09:52:30.AIbase

Amazon AWS Launches Human Benchmark Testing Team to Improve AI Model Evaluation

2023-11-29 09:08:23.AIbase

"Baimao Battle" Family's First, When Will Cheating in Large Model 'Scoring' Stop?

2023-11-02 15:21:41.AIbase

Ant Group Releases Benchmark for Large Model Evaluation in the DevOps Field

2023-09-25 09:54:21.AIbase

Investigation into the Chaos of Large Model Evaluation: Parameter Scale Isn't Everything

2023-08-29 10:09:08.AIbase

August Rankings: SuperCLUE Releases the Latest Results of Its Chinese Large Model Evaluation Benchmark

2023-08-18 10:04:45.AIbase

AI Startup Arthur Releases Open-Source AI Model Evaluation Tool 'Bench'